In the following paragraphs, note that the plots report probabilities
of correct repetition of a nonword, i.e. they are on the scale of the
predicted variable (the response scale). The statistical tests of the
various effects, however, rely on the non-transformed estimates and
confidence intervals, i.e. on the scale of the linear combination of
predictors / independent variables (the log-odds scale).
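The difference between the two scales can be illustrated with a minimal sketch in base R. The baseline values and the size of the shift below are hypothetical and are not taken from model_rep; the point is only that a constant shift on the log-odds scale does not translate into a constant change in probability.

```r
# Hypothetical values, for illustration only (not estimates from model_rep).
eta   <- c(-2, 0, 2)   # three baseline values on the log-odds (link) scale
delta <- 0.5           # the same shift applied to each baseline

p_before <- plogis(eta)          # inverse logit: log-odds -> probability
p_after  <- plogis(eta + delta)

# The shift is constant on the link scale ...
diff_link <- (eta + delta) - eta
# ... but its size on the probability scale depends on the baseline.
diff_prob <- p_after - p_before

print(round(data.frame(eta, p_before, p_after, diff_link, diff_prob), 3))
```

This is why curves that look non-parallel on a probability plot can still correspond to slopes that do not differ significantly on the scale on which the tests are carried out.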
Investigating the interactions
We can first investigate the higher-order terms of the model,
that is, the different interactions:
- occurrence_l * LoE
- branching_onset * LoE
- rec_voc * LoE
- age * phono_mem
- age * rec_voc
- phono_mem * rec_voc
The first two interactions involve one categorical variable and
the continuous variable LoE
(occurrence_l : LoE and
branching_onset : LoE).
We first define a range of variation for the values of LoE:
list.LoE <- list(LoE = seq(min(df_reduced$LoE), max(df_reduced$LoE), by = 1))
For occurrence_l : LoE:
plot_model(model_rep, type = "emm", terms = c("LoE [all]", "occurrence_l"))

emtrends(model_rep, pairwise ~ occurrence_l | LoE, var= "LoE", adjust = "mvt", infer = c(T,T))
$emtrends
LoE = 184:
occurrence_l LoE.trend SE df asymp.LCL asymp.UCL z.ratio p.value
coda 0.001585 0.00238 Inf -0.003089 0.00626 0.665 0.5063
final 0.003258 0.00204 Inf -0.000744 0.00726 1.595 0.1106
other 0.000621 0.00141 Inf -0.002143 0.00339 0.440 0.6597
Results are averaged over the levels of: branching_onset, V
Confidence level used: 0.95
$contrasts
LoE = 184:
contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.value
coda - final -0.001673 0.00244 Inf -0.00735 0.00401 -0.686 0.7663
coda - other 0.000964 0.00206 Inf -0.00384 0.00577 0.467 0.8838
final - other 0.002637 0.00165 Inf -0.00120 0.00647 1.600 0.2388
Results are averaged over the levels of: branching_onset, V
Confidence level used: 0.95
Conf-level adjustment: mvt method for 3 estimates
P value adjustment: mvt method for 3 tests
The plot shows that the slopes for the levels
other and coda are quite close, and
visually quite different from the slope for the third level,
final. The statistical tests, however, fail to detect a
significant difference. A possible reason is the large
standard errors for coda and final,
which are in turn due to the small number of nonwords at
these two levels (4 and 7 respectively, compared with 60 nonwords
without l in coda or final position):
my_table <- df_reduced %>%
select(nonword, occurrence_l) %>%
unique() %>%
with(., table(occurrence_l)) %>%
as.data.frame()
colnames(my_table) <- c("Occurrence of l", "# nonwords")
my_table
Occurrence of l # nonwords
1 coda 4
2 final 7
3 other 60
For branching_onset : LoE:
plot_model(model_rep, type = "emm", terms = c("LoE [all]", "branching_onset"))

emtrends(model_rep, pairwise ~ branching_onset | LoE, var= "LoE", adjust = "mvt", infer = c(T,T))
$emtrends
LoE = 184:
branching_onset LoE.trend SE df asymp.LCL asymp.UCL z.ratio p.value
0 0.002481 0.00138 Inf -0.000227 0.00519 1.796 0.0725
1 0.000619 0.00156 Inf -0.002440 0.00368 0.397 0.6917
2 0.002364 0.00290 Inf -0.003311 0.00804 0.817 0.4142
Results are averaged over the levels of: occurrence_l, V
Confidence level used: 0.95
$contrasts
LoE = 184:
contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.value
branching_onset0 - branching_onset1 0.001863 0.00115 Inf -0.00079 0.00452 1.618 0.2235
branching_onset0 - branching_onset2 0.000117 0.00268 Inf -0.00606 0.00629 0.044 0.9989
branching_onset1 - branching_onset2 -0.001746 0.00266 Inf -0.00788 0.00439 -0.656 0.7789
Results are averaged over the levels of: occurrence_l, V
Confidence level used: 0.95
Conf-level adjustment: mvt method for 3 estimates
P value adjustment: mvt method for 3 tests
While the plot suggests that the slopes for LoE
differ according to the value of branching_onset, the
statistical tests show that these slopes are not
significantly different from each other.
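To make explicit what the per-level slopes and their pairwise contrasts represent, here is a minimal sketch on simulated data, using a plain logistic regression rather than the mixed model model_rep. The factor g, the covariate x and the true slopes are all hypothetical. In a model with a factor-by-covariate interaction, the slope for a given level is the covariate coefficient plus the interaction coefficient for that level, and the contrasts are differences between these slopes.

```r
# Simulated data; g, x and the true slopes are assumptions made for illustration.
set.seed(42)
n <- 600
g <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
x <- rnorm(n)
true_slope <- c(a = 0.2, b = 0.6, c = 1.0)
y <- rbinom(n, 1, plogis(-0.3 + true_slope[as.character(g)] * x))

fit <- glm(y ~ g * x, family = binomial)
b <- coef(fit)

# Slope of x within each level of g, on the log-odds scale
slopes <- c(a = b[["x"]],
            b = b[["x"]] + b[["gb:x"]],
            c = b[["x"]] + b[["gc:x"]])

# Pairwise differences between slopes
slope_diffs <- c("a - b" = slopes[["a"]] - slopes[["b"]],
                 "a - c" = slopes[["a"]] - slopes[["c"]],
                 "b - c" = slopes[["b"]] - slopes[["c"]])

print(round(slopes, 3))
print(round(slope_diffs, 3))
```

The sketch only reproduces the point estimates; the standard errors, the averaging over other factors and the multiplicity adjustment are what emtrends adds on top.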
We then have 4 interactions between two continuous variables:
rec_voc : LoE, age : phono_mem,
age : rec_voc and phono_mem :
rec_voc.
For rec_voc : LoE:
plot_model(model_rep, type = "emm", terms = c("LoE [all]", "rec_voc"))
We can see that the different curves are nearly parallel, which is
consistent with the absence of a significant interaction between the two
variables.
Since the interaction between rec_voc and
LoE corresponds to the change in the slope of
rec_voc for every one-unit increase in
LoE (or vice versa), we can assess the significance
of this interaction by looking at the contrast / difference between two
slopes of rec_voc estimated at two values of
LoE that are one unit apart (the result does not depend on
which two values are chosen, as long as they are one unit apart).
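The equivalence between the interaction coefficient and a difference of slopes can be checked with a minimal sketch on simulated data, again with a plain logistic regression and hypothetical variables x and z rather than model_rep. With a term x:z in the model, the slope of x at a given value of z is the coefficient of x plus the interaction coefficient times that value, so two slopes taken one unit of z apart always differ by exactly the interaction coefficient.

```r
# Simulated data; x, z and the true coefficients are assumptions for illustration.
set.seed(1)
n <- 500
x <- rnorm(n)
z <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 0.8 * x + 0.3 * z + 0.4 * x * z))

fit <- glm(y ~ x * z, family = binomial)
b <- coef(fit)

# Slope of x (log-odds scale) at a given value of z
slope_x_at <- function(z0) unname(b["x"] + b["x:z"] * z0)

# Two pairs of z values, each one unit apart
d1 <- slope_x_at(1) - slope_x_at(0)
d2 <- slope_x_at(3.5) - slope_x_at(2.5)

print(c(interaction = unname(b["x:z"]), d1 = d1, d2 = d2))
```

Note that the pairwise contrast reported by emtrends is computed as the slope at the lower value minus the slope at the higher value (e.g. LoE183.5 - LoE184.5), so its sign is the opposite of the sign of the interaction coefficient; the z ratio and p-value are unaffected apart from that sign.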
list.LoE.red <- list(LoE = seq(median(df_reduced$LoE) - 0.5, median(df_reduced$LoE) + 0.5, by = 1))
emtrends(model_rep, pairwise ~ LoE, var = "rec_voc", at = list.LoE.red, adjust = "mvt", infer = c(T, T))
$emtrends
LoE rec_voc.trend SE df asymp.LCL asymp.UCL z.ratio p.value
184 0.0267 0.0203 Inf -0.013 0.0665 1.320 0.1869
184 0.0268 0.0203 Inf -0.013 0.0667 1.320 0.1870
Results are averaged over the levels of: occurrence_l, branching_onset, V
Confidence level used: 0.95
$contrasts
contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.value
LoE183.5 - LoE184.5 -0.000101 0.00021 Inf -0.000511 0.00031 -0.481 0.6309
Results are averaged over the levels of: occurrence_l, branching_onset, V
Confidence level used: 0.95
We find that the estimate for the interaction is not significantly
different from 0: the p-value (0.6309) is much larger than 0.05.
For age : phono_mem:
plot_model(model_rep, type = "emm", terms = c("age [all]", "phono_mem"))

list.phono_mem.red <- list(phono_mem = seq(median(df_reduced$phono_mem) - 0.5, median(df_reduced$phono_mem) + 0.5, by = 1))
emtrends(model_rep, pairwise ~ phono_mem, var = "age", at = list.phono_mem.red, adjust = "mvt", infer = c(T, T))
$emtrends
phono_mem age.trend SE df asymp.LCL asymp.UCL z.ratio p.value
3.5 0.0148 0.0109 Inf -0.00657 0.0361 1.357 0.1748
4.5 0.0176 0.0151 Inf -0.01200 0.0473 1.166 0.2434
Results are averaged over the levels of: occurrence_l, branching_onset, V
Confidence level used: 0.95
$contrasts
contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.value
phono_mem3.5 - phono_mem4.5 -0.00286 0.0125 Inf -0.0274 0.0217 -0.228 0.8193
Results are averaged over the levels of: occurrence_l, branching_onset, V
Confidence level used: 0.95
Once again, the p-value is much larger than 0.05.
For age : rec_voc:
plot_model(model_rep, type = "emm", terms = c("age [all]", "rec_voc"))

list.rec_voc.red <- list(rec_voc = seq(median(df_reduced$rec_voc) - 0.5, median(df_reduced$rec_voc) + 0.5, by = 1))
emtrends(model_rep, pairwise ~ rec_voc, var = "age", at = list.rec_voc.red, adjust = "mvt", infer = c(T, T))
$emtrends
rec_voc age.trend SE df asymp.LCL asymp.UCL z.ratio p.value
20 0.0165 0.0111 Inf -0.00522 0.0382 1.488 0.1367
21 0.0150 0.0112 Inf -0.00698 0.0369 1.336 0.1816
Results are averaged over the levels of: occurrence_l, branching_onset, V
Confidence level used: 0.95
$contrasts
contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.value
rec_voc20 - rec_voc21 0.00152 0.00142 Inf -0.00125 0.0043 1.075 0.2826
Results are averaged over the levels of: occurrence_l, branching_onset, V
Confidence level used: 0.95
The p-value for the interaction is higher than 0.05.
For phono_mem : rec_voc:
plot_model(model_rep, type = "emm", terms = c("phono_mem", "rec_voc"))

list.phono_mem.red <- list(phono_mem = seq(median(df_reduced$phono_mem) - 0.5, median(df_reduced$phono_mem) + 0.5, by = 1))
emtrends(model_rep, pairwise ~ phono_mem, var = "rec_voc", at = list.phono_mem.red, adjust = "mvt", infer = c(T, T))
$emtrends
phono_mem rec_voc.trend SE df asymp.LCL asymp.UCL z.ratio p.value
3.5 0.0207 0.0267 Inf -0.03161 0.0731 0.776 0.4376
4.5 0.0370 0.0239 Inf -0.00988 0.0839 1.547 0.1219
Results are averaged over the levels of: occurrence_l, branching_onset, V
Confidence level used: 0.95
$contrasts
contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.value
phono_mem3.5 - phono_mem4.5 -0.0163 0.0324 Inf -0.0799 0.0473 -0.502 0.6158
Results are averaged over the levels of: occurrence_l, branching_onset, V
Confidence level used: 0.95
The p-value is once again much larger than 0.05.
Our investigation of the interactions shows that none of them
is statistically significant. One option would then be to
simplify the model by dropping these interactions. However, this amounts
to model selection (for the fixed effects), which a number of prominent
statisticians warn against. In what follows, we therefore
assess the main effects while keeping the interactions in
the model.
Investigating the main effects
Given that none of the interactions we thought could be significant
appears to be so, we can focus on the main effects in our model,
i.e. the effects of the item-related categorical variables
occurrence_l, branching_onset and
V, and of the subject-related continuous variables
rec_voc, phono_mem,
LoE, age,
L1_syll_complexity and
phono_awareness.
For occurrence_l:
plot_model(model_rep, type = "emm", terms = "occurrence_l")

summary(emmeans(model_rep, pairwise ~ occurrence_l, adjust = "mvt", side = "<"), infer = c(TRUE, TRUE), null = 0)$contrasts
contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.value
coda - final -0.351 0.400 Inf -Inf 0.476 -0.877 0.4098
coda - other -1.364 0.336 Inf -Inf -0.670 -4.061 0.0001
final - other -1.013 0.257 Inf -Inf -0.483 -3.949 0.0001
Results are averaged over the levels of: branching_onset, V
Results are given on the log odds ratio (not the response) scale.
Confidence level used: 0.95
Conf-level adjustment: mvt method for 3 estimates
P value adjustment: mvt method for 3 tests
P values are left-tailed
We set the parameter side to reflect our hypothesis
that a nonword gets easier to repeat as we move from l in internal coda
position, to l in final position, to any other structure.
We find that:
- nonwords are significantly more difficult to repeat when l appears
in internal coda position than when there is no l (coda -
other, p < 0.0001).
- nonwords are significantly more difficult to repeat when l appears
in final position than when there is no l (final -
other, p < 0.0001)
- nonwords are not significantly more difficult to repeat when l
appears in internal coda position than when l appears in final position
(coda - final, p = 0.4101)
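Because the contrasts above are reported on the log odds ratio scale, exponentiating them gives odds ratios, which may be easier to read. The following minimal sketch simply copies the three estimates from the output above and applies exp(); it does not query model_rep.

```r
# Contrast estimates copied from the emmeans output above (log odds ratio scale)
contrasts_logit <- c("coda - final"  = -0.351,
                     "coda - other"  = -1.364,
                     "final - other" = -1.013)

# Exponentiating a log odds ratio gives an odds ratio
odds_ratios <- exp(contrasts_logit)

print(round(odds_ratios, 3))
```

For instance, the odds of a correct repetition for nonwords with l in internal coda position are roughly a quarter of the odds for nonwords without l in coda or final position.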
For branching_onset:
plot_model(model_rep, type = "emm", terms = "branching_onset")

summary(emmeans(model_rep, pairwise ~ branching_onset, adjust = "mvt", side = ">"), infer = c(TRUE, TRUE), null = 0)$contrasts
contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.value
branching_onset0 - branching_onset1 0.812 0.172 Inf 0.4654 Inf 4.725 <.0001
branching_onset0 - branching_onset2 1.830 0.465 Inf 0.8928 Inf 3.939 0.0001
branching_onset1 - branching_onset2 1.018 0.471 Inf 0.0678 Inf 2.162 0.0358
Results are averaged over the levels of: occurrence_l, V
Results are given on the log odds ratio (not the response) scale.
Confidence level used: 0.95
Conf-level adjustment: mvt method for 3 estimates
P value adjustment: mvt method for 3 tests
P values are right-tailed
We set the parameter side to reflect our hypothesis
that the more branching onsets in a nonword, the more difficult it is to
repeat.
We observe that all the contrasts are significant, and that
therefore:
- A nonword is more difficult to repeat when it has 1 branching onset
than when it has none (p < 0.0001)
- A nonword is more difficult to repeat when it has 2 branching onsets
than when it has none (p < 0.0001)
- A nonword is more difficult to repeat when it has 2 branching onsets
than when it has 1 (p = 0.0355)
For V:
plot_model(model_rep, type = "emm", terms = "V")

summary(emmeans(model_rep, pairwise ~ V, adjust = "mvt", side = ">"), infer = c(TRUE, TRUE), null = 0)$contrasts
contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.value
V1 - V2 0.101 0.199 Inf -0.3133 Inf 0.508 0.5926
V1 - V3 0.595 0.203 Inf 0.1735 Inf 2.939 0.0048
V2 - V3 0.494 0.201 Inf 0.0765 Inf 2.464 0.0190
Results are averaged over the levels of: occurrence_l, branching_onset
Results are given on the log odds ratio (not the response) scale.
Confidence level used: 0.95
Conf-level adjustment: mvt method for 3 estimates
P value adjustment: mvt method for 3 tests
P values are right-tailed
The parameter side corresponds to the hypothesis
that the fewer vowels in a nonword, the easier it is to
repeat.
We find two significant differences: a nonword with a single vowel is
easier to repeat than a nonword with 3 vowels (p = 0.0047), and
a nonword with 2 vowels is easier to repeat than a nonword with 3 vowels
(p = 0.0192). The difference between nonwords with 1 and 2 vowels is
not significant.
For rec_voc:
plot_model(model_rep, type = "emm", terms = "rec_voc [all]")

summary(emtrends(model_rep, ~ rec_voc, var = "rec_voc", adjust = "mvt", side = ">"), infer = c(TRUE, TRUE), null = 0)
rec_voc rec_voc.trend SE df asymp.LCL asymp.UCL z.ratio p.value
20.4 0.0268 0.0203 Inf -0.00662 Inf 1.320 0.0935
Results are averaged over the levels of: occurrence_l, branching_onset, V
Confidence level used: 0.95
P values are right-tailed
We set the parameter side to reflect our hypothesis
that the larger a child’s French receptive vocabulary, the higher the
probability of correct repetition.
We observe only a tendency for rec_voc (p =
0.0935), suggesting that the size of a child’s receptive vocabulary
may positively affect their ability to correctly repeat nonwords.
For phono_mem:
plot_model(model_rep, type = "emm", terms = "phono_mem")

summary(emtrends(model_rep, ~ phono_mem, var = "phono_mem", adjust = "mvt", side = ">"), infer = c(TRUE, TRUE), null = 0)
phono_mem phono_mem.trend SE df asymp.LCL asymp.UCL z.ratio p.value
3.88 0.0521 0.126 Inf -0.155 Inf 0.415 0.3392
Results are averaged over the levels of: occurrence_l, branching_onset, V
Confidence level used: 0.95
P values are right-tailed
We set the parameter side to reflect our hypothesis
that the larger a child’s phonological memory, the higher the
probability of correct repetition.
We do not observe a significant effect of
phono_mem.
For LoE:
plot_model(model_rep, type = "emm", terms = "LoE [all]")

summary(emtrends(model_rep, ~ LoE, var = "LoE", adjust = "mvt", side = ">"), infer = c(TRUE, TRUE), null = 0)
LoE LoE.trend SE df asymp.LCL asymp.UCL z.ratio p.value
184 0.00182 0.00159 Inf -0.000786 Inf 1.149 0.1253
Results are averaged over the levels of: occurrence_l, branching_onset, V
Confidence level used: 0.95
P values are right-tailed
We set the parameter side to reflect our hypothesis
that the longer the exposure to French, the higher the probability of
correct repetition.
We do not observe a significant effect of LoE on the
probability of correct repetition.
For age:
plot_model(model_rep, type = "emm", terms = "age [all]")

summary(emtrends(model_rep, ~ age, var = "age", adjust = "mvt", side = ">"), infer = c(TRUE, TRUE), null = 0)
age age.trend SE df asymp.LCL asymp.UCL z.ratio p.value
90.5 0.0158 0.0111 Inf -0.00241 Inf 1.428 0.0766
Results are averaged over the levels of: occurrence_l, branching_onset, V
Confidence level used: 0.95
P values are right-tailed
We set the parameter side to reflect our hypothesis
that the older a child is, the higher the probability of correct
repetition.
We do not observe a significant effect of age,
although we find a tendency suggesting that the older a child, the
higher the probability of correct repetition (p = 0.0766).
For L1_syll_complexity:
plot_model(model_rep, type = "emm", terms = "L1_syll_complexity [all]")

summary(emtrends(model_rep, ~ age, var = "L1_syll_complexity", adjust = "mvt", side = ">"), infer = c(TRUE, TRUE), null = 0)
age L1_syll_complexity.trend SE df asymp.LCL asymp.UCL z.ratio p.value
90.5 0.0389 0.115 Inf -0.151 Inf 0.337 0.3681
Results are averaged over the levels of: occurrence_l, branching_onset, V
Confidence level used: 0.95
P values are right-tailed
We set the parameter side to reflect our hypothesis
that the higher the complexity of the syllables of the L1, the higher
the probability of correct repetition.
We do not observe a significant effect of
L1_syll_complexity (p = 0.3681).
For phono_awareness:
plot_model(model_rep, type = "emm", terms = "phono_awareness [all]")

summary(emtrends(model_rep, ~ phono_awareness, var = "phono_awareness", adjust = "mvt", side = ">"), infer = c(TRUE, TRUE), null = 0)
phono_awareness phono_awareness.trend SE df asymp.LCL asymp.UCL z.ratio p.value
-2.82 0.161 0.068 Inf 0.0494 Inf 2.371 0.0089
Results are averaged over the levels of: occurrence_l, branching_onset, V
Confidence level used: 0.95
P values are right-tailed
We set the parameter side to reflect our hypothesis
that the more developed a child’s phonological awareness, the higher the
probability of correct repetition.
We observe a significant effect of phono_awareness
(p = 0.0089): the more developed the child’s phonological
awareness, the better they are at correctly repeating the nonwords.
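The right-tailed p-values reported for the continuous predictors can be recovered from their z ratios. The sketch below copies the z ratios and p-values from the outputs above and assumes a standard normal reference distribution, which matches the infinite degrees of freedom shown in the tables. Since each of these tests involves a single estimate, no multiplicity adjustment comes into play.

```r
# z ratios copied from the emtrends outputs above
z_ratio <- c(rec_voc = 1.320, phono_mem = 0.415, LoE = 1.149, age = 1.428,
             L1_syll_complexity = 0.337, phono_awareness = 2.371)

# Right-tailed p-value: probability of a z at least this large under the null
p_right <- pnorm(z_ratio, lower.tail = FALSE)

# p-values as reported above, for comparison
reported_p <- c(0.0935, 0.3392, 0.1253, 0.0766, 0.3681, 0.0089)

# Small discrepancies are expected because the z ratios are rounded
print(round(cbind(p_right, reported_p), 4))
print(max(abs(unname(p_right) - reported_p)))
```

This shortcut does not apply to the pairwise contrasts of the categorical predictors, whose p-values are adjusted with the mvt method. That adjustment relies on a numerical algorithm from the mvtnorm package, so adjusted p-values can differ in their last digits from one run to the next.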
Summary for the main effects
Not all our hypotheses were confirmed. We found the following
results:
- nonwords are significantly more difficult to repeat when l appears
in internal coda position than when there is no l (p <
0.0001).
- nonwords are significantly more difficult to repeat when l appears
in final position than when there is no l (p < 0.0001).
- the more branching onsets nonwords have, the more difficult they are
to repeat (1 versus 0 BO: p < 0.0001; 2 versus 0 BO:
p < 0.0001; 2 versus 1 BO: p = 0.0355)
- nonwords with 3 vowels are more difficult to repeat than nonwords
with 1 or 2 vowels (p = 0.0047 and p = 0.0192,
respectively)
- the more developed a child’s phonological awareness, the easier for
them to repeat the nonwords (p = 0.0089)
Additionally, we observed two statistical tendencies:
- the larger a child’s French receptive vocabulary, the easier for
them to repeat the nonwords (p = 0.0935)
- the older a child, the easier for them to repeat the nonwords
(p = 0.0766)
We did not observe effects of phono_mem or of
LoE. We also did not find a significant difference
between l appearing in internal coda position in nonwords and l
appearing in final position. Finally, we did not find an effect of the
complexity of syllables in the L1.
Our analysis overall proves to be quite simple, in the sense that we
did not observe any significant interaction.
p1 <- plot_model(model_rep, type = "emm", terms = "occurrence_l", show.values = TRUE, value.offset = .3, colors = "gs") +
labs(title = "Effect of occurrence_l", x = "Location of /l/", y = "Predicted probability of correct repetition")
p2 <- plot_model(model_rep, type = "emm", terms = "branching_onset", show.values = TRUE, value.offset = .3, colors = "gs") +
labs(title = "Effect of branching_onset", x = "Number of branching onsets", y = "Predicted probability of correct repetition")
p3 <- plot_model(model_rep, type = "emm", terms = "V", show.values = TRUE, value.offset = .3, colors = "gs") +
labs(title = "Effect of V", x = "Number of vowels", y = "Predicted probability of correct repetition")
p4 <- plot_model(model_rep, type = "emm", terms = "rec_voc [all]", show.values = TRUE, value.offset = .3, colors = "gs") +
labs(title = "Effect of rec_voc", x = "Size of the receptive vocabulary", y = "Predicted probability of correct repetition")
p5 <- plot_model(model_rep, type = "emm", terms = "age [all]", show.values = TRUE, value.offset = .3, colors = "gs") +
labs(title = "Effect of age", x = "Age", y = "Predicted probability of correct repetition")
p6 <- plot_model(model_rep, type = "emm", terms = "phono_awareness [all]", show.values = TRUE, value.offset = .3, colors = "gs") +
labs(title = "Effect of phonological awareness", x = "Phonological awareness", y = "Predicted probability of correct repetition")
grid.arrange(p1, p2, p3, p4, p5, p6, ncol = 3)
